AITopics | linguistic instruction

Collaborating Authors

linguistic instruction

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Grounded Vision-Language Interpreter for Integrated Task and Motion Planning

Siburian, Jeremy, Shirai, Keisuke, Beltran-Hernandez, Cristian C., Hamaya, Masashi, Görner, Michael, Hashimoto, Atsushi

arXiv.org Artificial IntelligenceNov-5-2025

While recent advances in vision-language models have accelerated the development of language-guided robot planners, their black-box nature often lacks safety guarantees and interpretability crucial for real-world deployment. Conversely, classical symbolic planners offer rigorous safety verification but require significant expert knowledge for setup. To bridge the current gap, this paper proposes ViLaIn-TAMP, a hybrid planning framework for enabling verifiable, interpretable, and autonomous robot behaviors. ViLaIn-TAMP comprises three main components: (1) a Vision-Language Interpreter (ViLaIn) adapted from previous work that converts multimodal inputs into structured problem specifications, (2) a modular Task and Motion Planning (TAMP) system that grounds these specifications in actionable trajectory sequences through symbolic and geometric constraint reasoning, and (3) a corrective planning (CP) module which receives concrete feedback on failed solution attempts and feed them with constraints back to ViLaIn to refine the specification. We design challenging manipulation tasks in a cooking domain and evaluate our framework. Experimental results demonstrate that ViLaIn-TAMP outperforms a VLM-as-a-planner baseline by 18% in mean success rate, and that adding the CP module boosts mean success rate by 32%.

artificial intelligence, machine learning, vilain-t amp, (17 more...)

arXiv.org Artificial Intelligence

2506.0327

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Reflex-Based Open-Vocabulary Navigation without Prior Knowledge Using Omnidirectional Camera and Multiple Vision-Language Models

Kawaharazuka, Kento, Obinata, Yoshiki, Kanazawa, Naoaki, Tsukamoto, Naoto, Okada, Kei, Inaba, Masayuki

arXiv.org Artificial IntelligenceAug-21-2024

Various robot navigation methods have been developed, but they are mainly based on Simultaneous Localization and Mapping (SLAM), reinforcement learning, etc., which require prior map construction or learning. In this study, we consider the simplest method that does not require any map construction or learning, and execute open-vocabulary navigation of robots without any prior knowledge to do this. We applied an omnidirectional camera and pre-trained vision-language models to the robot. The omnidirectional camera provides a uniform view of the surroundings, thus eliminating the need for complicated exploratory behaviors including trajectory generation. By applying multiple pre-trained vision-language models to this omnidirectional image and incorporating reflective behaviors, we show that navigation becomes simple and does not require any prior setup. Interesting properties and limitations of our method are discussed based on experiments with the mobile robot Fetch.

instruction, navigation, robot, (15 more...)

arXiv.org Artificial Intelligence

doi: 10.1080/01691864.2024.2393409

2408.1138

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.36)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.35)

Add feedback

Daily Assistive View Control Learning of Low-Cost Low-Rigidity Robot via Large-Scale Vision-Language Model

Kawaharazuka, Kento, Kanazawa, Naoaki, Obinata, Yoshiki, Okada, Kei, Inaba, Masayuki

arXiv.org Artificial IntelligenceDec-12-2023

In this study, we develop a simple daily assistive robot that controls its own vision according to linguistic instructions. The robot performs several daily tasks such as recording a user's face, hands, or screen, and remotely capturing images of desired locations. To construct such a robot, we combine a pre-trained large-scale vision-language model with a low-cost low-rigidity robot arm. The correlation between the robot's physical and visual information is learned probabilistically using a neural network, and changes in the probability distribution based on changes in time and environment are considered by parametric bias, which is a learnable network input variable. We demonstrate the effectiveness of this learning method by open-vocabulary view control experiments with an actual robot arm, MyCobot.

experiment, linguistic instruction, robot, (13 more...)

arXiv.org Artificial Intelligence

2312.07451

Country:

Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.05)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.05)

Genre: Research Report (0.86)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (0.55)

Add feedback

Hierarchical Bayesian Model for the Transfer of Knowledge on Spatial Concepts based on Multimodal Information

Hagiwara, Yoshinobu, Taguchi, Keishiro, Ishibushi, Satoshi, Taniguchi, Akira, Taniguchi, Tadahiro

arXiv.org Artificial IntelligenceMar-10-2021

This paper proposes a hierarchical Bayesian model based on spatial concepts that enables a robot to transfer the knowledge of places from experienced environments to a new environment. The transfer of knowledge based on spatial concepts is modeled as the calculation process of the posterior distribution based on the observations obtained in each environment with the parameters of spatial concepts generalized to environments as prior knowledge. We conducted experiments to evaluate the generalization performance of spatial knowledge for general places such as kitchens and the adaptive performance of spatial knowledge for unique places such as `Emma's room' in a new environment. In the experiments, the accuracies of the proposed method and conventional methods were compared in the prediction task of location names from an image and a position, and the prediction task of positions from a location name. The experimental results demonstrated that the proposed method has a higher prediction accuracy of location names and positions than the conventional method owing to the transfer of knowledge.

information, location name, spatial concept, (16 more...)

arXiv.org Artificial Intelligence

2103.06442

Country:

Asia > Middle East > Jordan (0.04)
Asia > Japan > Honshū > Kansai > Osaka Prefecture > Osaka (0.04)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
(2 more...)

Add feedback